Overview
What is Apache Hive?
Apache Hive is database/data warehouse software that supports data querying and analysis of large datasets stored in the Hadoop distributed file system (HDFS) and other compatible systems, and is distributed under an open source license.
Pricing
Entry-level set up fee?
- No setup fee
Offerings
- Free Trial
- Free/Freemium Version
- Premium Consulting/Integration Services
Product Demos
Apache Hive Hadoop Ecosystem - Big Data Analytics Tutorial by Mahesh Huddar
Connecting Microsoft Power BI to Apache Hive using Simba Hive ODBC driver
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive
Product Details
Apache Hive Technical Details
Operating Systems | Unspecified |
---|---|
Mobile Application | No |
Reviews and Ratings (97)
Community Insights
Business Problems Solved
Apache Hive is a versatile software that has been widely used across various departments and organizations for different use cases. It has proven to be particularly helpful in handling large datasets, migrating data between different operating systems, synchronizing programs, and fetching and generating product metrics. Users have found value in using Hive for data analytics, engineering, data science, product management, and IT-related tasks such as improving analysis of big datasets stored in Hadoop HDFS.
Furthermore, Apache Hive has simplified the process of filtering and cleaning data using SQL, reducing the learning curve for handling big data. It allows users to run SQL queries against data in Hadoop, enabling efficient analysis of large datasets without the need to learn a new language. Additionally, Hive has been utilized for building reports, analyzing data stored in the Hadoop file system, processing events gathered in HDFS, and converting them into parquet files for fast querying.
Overall, users have praised Apache Hive for its scalability, accessibility, and cost-effectiveness in storing and retrieving analytics data. It has provided an intuitive solution for storing large datasets, querying big sets of data using SQL, aggregating massive datasets into distilled information for data-driven decision making, and creating external and internal tables in Hadoop/BigData projects. With its ability to process both unstructured and structured data efficiently, Hive has become an essential tool for data analysts, engineers, and business analysts across organizations.
Reviews
(1-8 of 8)
Help your dev team!
- Simplifies queries for devs
- Organize data
- Batch process
- Deploy
- Maintenance
- Support
It is an advance to the ease of the processes
- The unification of the data will help to establish the commercial criteria.
- We are sure that the data is protected
- If you try to extract an excessive amount of data, the system becomes slow
- There is a risk that the system collapses under the volume of data
Excellent bigdata warehouse solution
Our use case is a large data analytics project where data frequency and velocity are very high. Apache Hive is very useful for processing both unstructured and structured data seamlessly. It reduces the need to write complex queries because it is built around SQL, and our engineering team, which is very proficient in SQL, uses Apache Hive to process the big data.
We have identified no business issues using the solution.
- Apache Hive supports external data tables.
- Supports data partitioning to improve overall performance.
- Apache Hive is a reliable and scalable solution.
- Apache Hive supports writing ad hoc queries as well.
- Apache Hive is not well suited to OLTP-based jobs.
- We sometimes observed high latency when querying data.
- Row-level data updates are limited.
- Training materials need improvement.
The Metastore stores the metadata for each table and its schema. The Driver acts as a controller for statement execution. Other components, such as the Optimizer, the CLI, and the Thrift Server, also help enable big data processing and transformation.
Apache Hive: Big data querying tool w/SQL interface, but slower, more costly computation
- Flexibility through schema on read
- Familiar SQL like query language
- Functions for complex queries and analysis
- Slower processing than other tools on the market
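The "schema on read" point in the list above can be illustrated with a small sketch. This is not Hive code. It is a plain Python analogy, and the sample data, the `SCHEMA` definition, and the `read_with_schema` helper are all hypothetical names invented for illustration. The idea is that raw data is stored without validation, and column names and types are applied only when the data is read.

```python
import csv
import io

# Raw data is stored exactly as it arrived; nothing is validated on write.
# (Hypothetical sample data for illustration.)
RAW_LINES = "1,alice,34\n2,bob,not_a_number\n3,carol,29\n"

# Hypothetical schema, applied only when the data is read.
SCHEMA = [("id", int), ("name", str), ("age", int)]


def read_with_schema(raw, schema):
    """Parse raw text into rows, casting each field at read time.

    A field that cannot be cast becomes None instead of failing the read,
    which loosely mirrors how a schema-on-read system can surface NULLs.
    """
    rows = []
    for record in csv.reader(io.StringIO(raw)):
        row = {}
        for (column, cast), value in zip(schema, record):
            try:
                row[column] = cast(value)
            except ValueError:
                row[column] = None
        rows.append(row)
    return rows


rows = read_with_schema(RAW_LINES, SCHEMA)
```

Because the schema lives apart from the stored bytes, the same raw file could be read again later under a different schema without rewriting it, which is the flexibility the reviewer refers to.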
Manage data for your warehouse as strong as a beehive using Apache Hive!
It was one of those technical sessions where I was supposed to demonstrate a word count program on a novel downloaded from Project Gutenberg. I was able to download the novel, load it into the Hadoop platform, and execute a HiveQL query (HiveQL is the SQL-like syntax used by Apache Hive) to show a few unique words, their counts, and related examples.
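The reviewer's actual query is not shown, so the following is only a rough Python sketch of what a word count computes. The `word_counts` helper and the sample sentence are assumptions made for illustration. In HiveQL the same shape is commonly expressed by splitting each line into words, exploding them into rows, and grouping by word.

```python
import re
from collections import Counter


def word_counts(text):
    """Count occurrences of each word, ignoring case and punctuation."""
    # Roughly analogous to split + explode in HiveQL: turn text into one word per item.
    words = re.findall(r"[a-z']+", text.lower())
    # Roughly analogous to GROUP BY word with COUNT(*).
    return Counter(words)


# Hypothetical sample text standing in for the downloaded novel.
sample = "It was the best of times, it was the worst of times"
counts = word_counts(sample)

# The most frequent words, similar to ORDER BY count DESC LIMIT 3.
top = counts.most_common(3)
```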
- The capability to handle large amounts of data and its querying process.
- A syntax similar to SQL is an added advantage.
- An active developer support and community always ready to help.
- Ease of usage.
- Resource-consuming at times, though that may be because I was using a large object file.
- Needs update or modify functionality; this is a minimal CRUD requirement.
The only underlying problem could be that Apache Hive is designed to run on the Apache Hadoop ecosystem. People who are not comfortable with a Linux-style tree-structured file system, or who are unlikely to use a Linux OS, might not like using Hive.
Apache Hive: SQL, open-source querying tool
- Monitor query performance
- Manage tables in the data warehouse
- Uses standard SQL
- UI is quite dated and not intuitive
- Open-source, so does not have consistent updates or support
- Not the most optimal for ETL processes
Apache Hive Review
- SQL like query engine, allows easy ramp up from a standard RDBMS
- Scalability is great
- If properly configured, the data retrieval is fantastic
- The way we currently have it implemented is quite slow, but I believe that's more of our implementation
- Joins tend to be slow
Hive, a very powerful open source data warehouse solution.
- Partition to increase query efficiency.
- SerDe support for different data storage formats.
- Integrates well with Impala, and data can be queried by Impala.
- Support for the Parquet compression format
- Slower than Impala since it uses MapReduce
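Several reviewers mention partitioning as a way to improve query efficiency. The sketch below is a toy Python model of that idea, not Hive's implementation. The `PartitionedTable` class, the `event_date` column, and the row counts are hypothetical. Rows are grouped by the value of a partition column, so a query that filters on that column only scans the matching group.

```python
from collections import defaultdict


class PartitionedTable:
    """Toy table that groups rows by one partition column."""

    def __init__(self, partition_column):
        self.partition_column = partition_column
        self.partitions = defaultdict(list)  # partition value -> rows
        self.rows_scanned = 0                # counter to show the scan cost

    def insert(self, row):
        self.partitions[row[self.partition_column]].append(row)

    def select(self, partition_value=None):
        """Return rows; with a partition filter, only that partition is scanned."""
        if partition_value is not None:
            candidates = [self.partitions.get(partition_value, [])]
        else:
            candidates = list(self.partitions.values())
        result = []
        for bucket in candidates:
            self.rows_scanned += len(bucket)
            result.extend(bucket)
        return result


# Hypothetical data: 100 events on each of three days.
table = PartitionedTable("event_date")
for day in ("2024-01-01", "2024-01-02", "2024-01-03"):
    for user in range(100):
        table.insert({"event_date": day, "user_id": user})

one_day = table.select(partition_value="2024-01-02")
scanned_with_pruning = table.rows_scanned

table.rows_scanned = 0
everything = table.select()
scanned_without_pruning = table.rows_scanned
```

In this model the filtered query touches one third of the rows that the unfiltered one does. How much a real Hive deployment gains depends on how the data is laid out and which columns queries filter on.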